An NN-based Approach to Prosodic for Synthesizing English Words Em
Abstract
This paper discusses a neural network-based approach to generating proper prosodic information for spelling/reading English words embedded in background Chinese text. It extends an existing RNN-based prosodic information generator for Mandarin TTS into an RNN-MLP scheme for Mandarin-English mixed-lingual TTS. The scheme first treats each English word as a Chinese word and uses the RNN, trained for Mandarin TTS, to generate a set of initial prosodic information for each syllable of the English word. It then refines the initial prosodic information with additional MLPs. The resulting prosodic information is expected both to be appropriate for English-word synthesis and to match that of the background Mandarin speech. Experimental results showed that the proposed RNN-MLP scheme performed well. For English word spelling/reading, the open tests achieved RMSEs of 41.8/78.2 ms, 30.8/26 ms, 0.65/0.45 ms/frame, and 3.06/4.9 dB for the synthesized syllable duration, inter-syllable pause duration, pitch contour, and energy level, respectively. These results suggest that the approach is promising.
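The two ideas in the abstract, refining initial prosodic parameters with an MLP and scoring the result by RMSE, can be illustrated with a minimal sketch. The abstract does not give the network sizes, input features, or trained weights, so everything here is an assumption for illustration: `mlp_refine`, `rmse`, the single tanh hidden layer, and all numeric values are hypothetical and are not the authors' implementation.

```python
import math
from typing import List, Sequence


def rmse(predicted: Sequence[float], target: Sequence[float]) -> float:
    """Root-mean-square error between two equal-length sequences."""
    if len(predicted) != len(target) or len(predicted) == 0:
        raise ValueError("sequences must be non-empty and of equal length")
    squared = sum((p - t) ** 2 for p, t in zip(predicted, target))
    return math.sqrt(squared / len(predicted))


def mlp_refine(
    initial: Sequence[float],
    w_hidden: Sequence[Sequence[float]],
    b_hidden: Sequence[float],
    w_out: Sequence[Sequence[float]],
    b_out: Sequence[float],
) -> List[float]:
    """Hypothetical refinement stage: one tanh hidden layer, linear outputs.

    `initial` stands in for the prosodic parameters that a first-stage
    generator would produce for one syllable (assumed layout).
    """
    hidden = [
        math.tanh(sum(w * x for w, x in zip(row, initial)) + b)
        for row, b in zip(w_hidden, b_hidden)
    ]
    return [
        sum(w * h for w, h in zip(row, hidden)) + b
        for row, b in zip(w_out, b_out)
    ]


if __name__ == "__main__":
    # Toy values only: two initial parameters, one hidden unit, two outputs.
    initial_params = [0.2, -0.1]
    refined = mlp_refine(
        initial_params,
        w_hidden=[[0.5, 0.5]],
        b_hidden=[0.0],
        w_out=[[1.0], [2.0]],
        b_out=[0.1, 0.0],
    )
    print("refined parameters:", refined)

    # RMSE over made-up syllable durations in ms (not data from the paper).
    print("duration RMSE (ms):", rmse([210.0, 180.0], [200.0, 190.0]))
```

In a real system the weights would be learned from recorded speech, and a separate RMSE would be reported per prosodic parameter, as the abstract does for duration, pause, pitch, and energy.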
Related resources
An NN-based Approach to Prosody Generation for English Word Spelling in English-Chinese Bilingual TTS
In this paper, an RNN-MLP-based scheme to generate proper prosodic information for spelling English words embedded in Chinese text background is proposed. It is extended from the RNN prosody synthesis scheme of an existing Mandarin TTS by adding four MLPs to follow the RNN. It first treats each English word as a Chinese word and uses the RNN to generate eight prosodic parameters for each alphab...
Identification of Organizational Culture Components Based on Islamic – Iranian Values: A Field Literature Review with Synthesizing Approach
Organizational culture is defined as the prominent values and set of key characteristics that govern an organization. Paying attention to organizational culture increases staff productivity and job satisfaction. Therefore, the aim of this study was the identification, counting, and classification of organizational culture components based on Islamic-Iranian values by synthesizing appro...
A Corpus-based Analysis of Epistemic Stance Adverbs in Essays Written by Native English Speakers and Iranian EFL Learners
Academic essays entail taking a stance on the truth value of propositions. Epistemic adverbs deal with the speaker's assessment of the truth value of propositions. Employing a corpus-based approach with descriptive statistics and qualitative description, this study explored the use of epistemic stance adverbs in academic essays written by native English speakers and Iranian EFL learners. Follow...
Production of English Lexical Stress by Persian EFL Learners
This study examines the phonetic properties of lexical stress in English produced by Persian speakers learning English as a foreign language. The four most reliable phonetic correlates of English lexical stress, namely fundamental frequency, duration, intensity, and vowel quality, were measured across Persian speakers’ production of the stressed and unstressed syllables of five English disyllabi...
Neural Network Based Recognition System Integrating Feature Extraction and Classification for English Handwritten
Handwriting recognition has been one of the active and challenging research areas in the field of image processing and pattern recognition. It has numerous applications, including reading aids for the blind, bank cheques, and conversion of any handwritten document into structured text form. A neural network (NN), with its inherent learning ability, offers promising solutions for handwritten characte...
Publication date: 2003